CHOP proteins into structural domain-like fragments.
Abstract
We developed a method, CHOP, that dissects proteins into domain-like fragments. The basic idea was to cut proteins beginning from very reliable experimental information (PDB), proceeding to expert annotations of domain-like regions (Pfam-A), and completing the dissection through cuts based on the termini of known proteins. In this way, CHOP dissected more than two thirds of all proteins from 62 proteomes. Analysis of our structural domain-like fragments revealed four surprising results. First, >70% of all dissected proteins contained more than one fragment. Second, most domains spanned approximately 100 residues on average. This average was similar for eukaryotic and prokaryotic proteins, and it also held, although it had not been described previously, for all proteins in the PDB. Third, single-domain proteins were significantly longer than most domains in multidomain proteins. Fourth, three fourths of all domains appeared shorter than 210 residues. We believe that our CHOP fragments constitute an important resource for functional and structural genomics. Nevertheless, our main motivation for developing CHOP was that the single-linkage clustering method failed to adequately group full-length proteins. In contrast, CLUP, the simple clustering scheme introduced here, largely succeeded in grouping the CHOP fragments from 62 proteomes such that all members of one cluster shared a basic structural core. CLUP found >63,000 multi-member and >118,000 single-member clusters. Although most fragments were restricted to a particular cluster, approximately 24% of the fragments were duplicated in at least two clusters. Our thresholds for grouping two fragments into the same cluster were rather conservative. Nevertheless, our results suggested that structural genomics initiatives have to target >30,000 fragments to cover at least the multi-member clusters in 62 proteomes.
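The tiered cutting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `chop`, the rule that a lower-priority region is accepted only if it overlaps nothing already accepted, and the `min_len` cutoff for leftover stretches are all assumptions made for illustration. Regions are 1-based, inclusive residue ranges supplied by the caller.

```python
from typing import List, Tuple

Region = Tuple[int, int]          # (start, end), 1-based inclusive
Fragment = Tuple[int, int, str]   # (start, end, source label)


def chop(length: int,
         tiers: List[Tuple[str, List[Region]]],
         min_len: int = 30) -> List[Fragment]:
    """Dissect a protein of `length` residues using prioritized evidence tiers.

    `tiers` is ordered from most to least reliable, e.g. PDB, Pfam-A, termini.
    `min_len` is a hypothetical cutoff for keeping unannotated leftovers.
    """
    accepted: List[Fragment] = []
    for source, regions in tiers:
        for start, end in sorted(regions):
            # Skip regions that fall outside the protein or are malformed.
            if start < 1 or end > length or start > end:
                continue
            # Assumed rule: keep a region only if it overlaps nothing
            # accepted from this or a more reliable tier.
            if all(end < s or start > e for s, e, _ in accepted):
                accepted.append((start, end, source))

    accepted.sort()
    fragments: List[Fragment] = []
    pos = 1  # first residue not yet assigned to a fragment
    for start, end, source in accepted:
        # Keep the unannotated stretch before this region if long enough.
        if start - pos >= min_len:
            fragments.append((pos, start - 1, "remainder"))
        fragments.append((start, end, source))
        pos = end + 1
    # Keep a sufficiently long C-terminal leftover as well.
    if length - pos + 1 >= min_len:
        fragments.append((pos, length, "remainder"))
    return fragments


if __name__ == "__main__":
    example = chop(
        400,
        [
            ("PDB", [(1, 120)]),
            ("Pfam-A", [(100, 220), (230, 330)]),
            ("termini", [(331, 400)]),
        ],
    )
    for fragment in example:
        print(fragment)
```

In this toy input, the Pfam-A region 100-220 is rejected because it overlaps the higher-priority PDB region, and the unannotated stretch 121-229 is kept as a leftover fragment because it exceeds the assumed length cutoff.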
Related resources
CHOP: parsing proteins into structural domains
Sequence-based domain assignment is one of the most important and challenging problems in structural biology. We have developed a method, CHOP, that chops proteins into domain-like fragments. The basic idea is to cut proteins from entirely sequenced organisms beginning from very reliable experimental information (Protein Data Bank), proceeding to expert annotations of domain-like regions (Pfam-...
Automatic target selection for structural genomics on eukaryotes.
A central goal of structural genomics is to experimentally determine representative structures for all protein families. At least 14 structural genomics pilot projects are currently investigating the feasibility of high-throughput structure determination; the National Institutes of Health funded nine of these in the United States. Initiatives differ in the particular subset of "all families" on...
A novel effector domain from the RNA-binding protein TLS or EWS is required for oncogenic transformation by CHOP.
In human myxoid liposarcoma, a chromosomal rearrangement leads to fusion of the growth-arresting and DNA-damage-inducible transcription factor CHOP (GADD153) to a peptide fragment encoded by the TLS gene. We have found that wild-type TLS and a closely related sarcoma-associated protein, EWS, are both abundant nuclear proteins that associate in vivo with products of RNA polymerase II transcripti...
CHANGES OF PERK AND CHOP PROTEINS IN ENDOPLASMIC RETICULUM OF CARDIAC MYOCYTES AND TNF IN DIABETIC WISTAR RATS FOLLOWING CONTINUOUS AND INTERVAL EXERCISE
Background: Physical activity plays a major role in the prevention of cardiovascular disease and diabetes, but the effect of intense activity on endoplasmic reticulum proteins and apoptosis and necroptosis in diabetic conditions is unclear. The aim of the present study was to investigate the changes of PERK and CHOP proteins in endoplasmic reticulum of cardiac myocytes of diabetic Wistar rats f...
The roles of EPIYA sequence to perturb the cellular signaling pathways and cancer risk
Abstract It was shown that several pathogenic bacterial effector proteins contain the Glu-Pro-Ile-Tyr-Ala (EPIYA) or a similar sequence. These bacterial EPIYA effectors are delivered into host cell via type III or IV secretion system, where they undergo tyrosine phosphorylation at the EPIYA sequences, which triggers interaction with multiple host cell SH2 domain-containing proteins and thereby...
Journal:
- Proteins
Volume 55, Issue 3
Publication date: 2004